Structural correlations and melting of B-DNA fibres 
Despite numerous attempts, the understanding of the thermal denaturation of DNA is still a
challenge due to the lack of structural data at the transition since standard experimental approaches
to DNA melting are made in solution and do not provide spatial information. We report a mea- 
surement using neutron scattering from oriented DNA fibres to determine the size of the regions 
that stay in the double-helix conformation as the melting temperature is approached from below. 
A Bragg peak from the B-form of DNA has been observed as a function of temperature and its 
width and integrated intensity have been measured. These results, complemented by a differential
calorimetry study of the melting of B DNA fibres as well as electrophoresis and optical observation 
data, are analysed in terms of a one-dimensional mesoscopic model of DNA. 



I. INTRODUCTION 



The X-ray diagrams published by Wilkins et al.
and Franklin et al. [2] in the same issue of Nature as
the famous paper of Watson and Crick describing the
structure of DNA revealed the significance of the fibre
form of DNA in providing the oriented samples necessary for
structural studies. These images, showing the cross
pattern typical of a helix and two strong spots associated
with the stacking of the bases in B-DNA, nevertheless
illustrated only one aspect of the molecule, its average
static structure. In reality the DNA molecule is a highly
dynamical object. Its base pairs fluctuate widely. The
lifetime of a closed base pair is only of the order of 10
ms. The local opening of the pairs is important for
biological function as it allows the reading of the genetic 
code. When temperature is raised above the physiologi- 
cal range, thermally induced base-pair openings become 
more cooperative, leading to the so-called "DNA bub- 
bles" i.e. open regions which may extend over tens of 
base pairs. At sufficiently high temperature they extend 
over the full molecule and the two strands separate from 
each other. For a physicist the thermal denaturing of 
DNA, also known as DNA "melting" , is a phase transi- 
tion, which is particularly interesting because it occurs 
in an essentially one-dimensional system. DNA melting 
started to attract attention soon after the discovery of 
the double helix structure and was widely studied, 

providing insights on the interactions within DNA, the 
influence of base pair sequence on DNA unwinding, and 
the effect of the solvent on DNA stability. It has recently
attracted renewed interest thanks to High Resolution
Melting (HRM) methods, whereby precise denaturing
profiles provide a new tool for biology laboratories. 



Despite numerous attempts, the understanding of this 
remarkable thermodynamic phase transition is still a the- 
oretical challenge. Statistical physics of DNA thermal 
denaturing has a long history [7] because it raises the 
fundamental question of a phase transition in a one- 
dimensional system, but also for practical applications 
such as the design of Polymerase Chain Reaction (PCR) 
probes or the HRM studies for biology. The models for 
DNA denaturation fall in two classes. First, Ising models 
treat a base pair as a two-state system, which is either 
closed or open. This is the case of the prototype Poland-
Scheraga model [9]. Those models are appealing for their
simplicity and because their parameters have been well 
calibrated. However their drawback is that they need 
a large number of phenomenological parameters and, for 
genomic sequences, the calculation can become heavy due 
to non-local entropy contributions. The second class of 
models goes beyond a description in terms of two-state 
systems by incorporating some elements of the structure. 
The Peyrard-Bishop-Dauxois (PBD) model [10] is still
simple because it represents the status of a base pair by 
a single real number measuring the stretching of the bond 
between the bases, but contains nevertheless a minimal 
structural information relevant for structure factor cal- 
culations. In its simplest version this model allows a fast 
calculation of melting curves of long natural DNAs with 
only 7 parameters [11]. However the success of different
models in describing complex melting profiles [11, 12]
shows that the correct fit of those curves is not a suffi- 
cient test to validate a theory. Examination of further 
observables, with a more direct link to the underlying 
structural details, appears necessary. As for other classes 
of phase transitions in physics, such as magnetic systems, 
an important feature that characterises the nature of a 
transition is the growth of the size of the correlated
domains as the transition is approached [13].

Traditional methods to investigate the DNA melting 
transition cannot provide this kind of spatial informa- 
tion. The standard experimental method is to record the 
sharp increase of UV absorbance at 260 nm which is asso- 
ciated with the un-stacking of the base pairs, while slowly 
heating a dilute DNA solution. Other approaches rely on 
circular dichroism measurements or calorimetric studies 
that measure the heat absorbed at the transition. Al- 
though melting curves, showing the fraction of open base 
pairs versus temperature, may exhibit a multi-step be- 
haviour related to the sequence, none of the experiments 
are actually sensitive to spatial information, such as the 
size of the intact regions of the double helix. 

This structural information, which is essential to un- 
derstand the nature of a phase transition, has been lack- 
ing. Neutron scattering can provide this missing piece of 
information, provided the experiment can be performed 
on an oriented DNA sample. As shown by the historical
studies that revealed the double-helical structure,
this is possible with fibre diffraction. The methods to
produce fibre samples have been refined: by controlling
the ionic and water content, it is now possible to make
high quality fibres with various configurational structures.

Here we report a measurement using neutron scattering
from oriented DNA fibres (Sec. II) to determine the
size of the regions that stay in the double-helix
conformation as the melting temperature is approached from
below, and we show how it can be analysed in terms of
the one-dimensional mesoscopic PBD model of DNA [10]
(Secs. III and IV). We recently published a brief report of
those results [16], which are presented here with further
data and discussion.



II. EXPERIMENTS 

Due to the regular stacking of the base pairs, the
B-form of DNA can be viewed as a one-dimensional diffraction
grating. This is reflected in a strong Bragg peak
for a longitudinal component of the scattering vector
$Q_\parallel \approx 1.87$ Å$^{-1}$, associated with the average distance
$a = 3.36$ Å between consecutive base pairs. The principle
of our experiment is simple: by following the evolution of 
this peak as temperature is raised from room tempera- 
ture to the denaturation temperature Tc, we can monitor 
the breaking of this "diffraction grating" into pieces sep- 
arated by denatured regions, where the base stacking is 
destroyed. We expect a strong broadening of the diffrac- 
tion peak as Tc is approached. The width of the peak 
allows us to determine how the average size of the intact 
double helical domains evolves when DNA approaches its 
denaturation temperature, which is critical information 
for a theoretical analysis of the transition. The advantage of
this method, which focuses on a single, intense diffraction
peak, is that we collect precisely the information of
interest in a measurement which is only weakly perturbed
by sample imperfections. In fibres, the B-form of DNA
is semicrystalline [13]. The misalignment of the DNA
molecules has been estimated to be less than 5 degrees.
Its effect on the projection of the base pair distance
on the axis is less than 0.5 %, i.e. negligible with respect 
to other effects such as the variation of the inter base pair 
distance as a function of the sequence. As a result, for a 
cut in reciprocal space along the fibre axis, the width of 
the Bragg peak is not affected. Moreover, by performing 
a scan off-centre, i.e. with a scattering vector Q which has 
a non-zero component orthogonal to the molecular axis, 
we can also probe the displacement of the base pairs in 
the transverse direction. Such a scan is not immune from 
the influence of the misalignment of the molecules, which
broadens the peak, but it provides interesting data on the
fluctuations due to the opening of the base pairs in the 
vicinity of the thermal denaturation. 
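
The two numerical statements of this section can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the data analysis: it evaluates the position $2\pi/a$ of the first Bragg peak for the quoted spacing of 3.36 Å, and the relative change $1 - \cos\theta$ of the projected base-pair distance for a molecule tilted by 5 degrees. The function names are ours.

```python
import math

A_SPACING = 3.36  # average base-pair spacing in angstrom (value quoted in the text)


def bragg_position(spacing):
    """Longitudinal position of the first Bragg peak, 2*pi/a, in inverse angstrom."""
    return 2.0 * math.pi / spacing


def projection_change(misalignment_deg):
    """Relative reduction of the spacing projected on the fibre axis
    for a molecule tilted by the given angle: 1 - cos(angle)."""
    return 1.0 - math.cos(math.radians(misalignment_deg))


q_par = bragg_position(A_SPACING)
shrink = projection_change(5.0)
print(f"Q_par = {q_par:.3f} inverse angstrom")
print(f"projection change at 5 degrees = {100.0 * shrink:.2f} %")
```

The projection change stays below the 0.5 % bound quoted above, which is why a cut along the fibre axis is insensitive to the misalignment.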



A. Materials and methods 

Sample preparation 

The samples were made from natural DNA extracted 
from salmon testes (Fluka) . The DNA had been oriented 
using a "spinning" technique [13], whereby the DNA is
precipitated out of a 0.4 M lithium salt solution, drawn
to a fibre and then wound around a bobbin to make a
film of parallel fibres. The samples were then dried, cut
from the bobbin, and stored for a number of weeks
in an atmosphere humidified to 75% using $^2$H$_2$O. This
fixed the water content of the DNA, ensuring a B-form
configurational structure, and significantly reduced
incoherent neutron scattering from protonated hydrogen in
the sample. The B-form was confirmed using x-ray and
neutron diffraction. 

The neutron scattering samples were folded in con- 
certina fashion, thus preserving the fibre axis direction. 
The samples were then placed in a niobium envelope and 
sealed between aluminum plates using lead wire for the 
seal. The sample cassette was therefore airtight which 
maintained the water content throughout the experi- 
ment. The sample mass was 1 g. 

Neutron scattering 

Preliminary diffraction measurements to establish the 
configurational form were carried out using the IN3 three 
axis spectrometer at the Institut Laue-Langevin, France. 
The instrument was configured with a pyrolytic graphite 
(PG) monochromator and analyser, and the wavelength 
was set to 2.36 Å (= 14.7 meV). The Q-resolution was
defined by 60' collimation before and after the sample, and
higher order wavelength contamination was suppressed
using a PG filter. The instrument was used to measure
reciprocal space maps, and an example is shown in figure 1.

The neutron three-axis spectrometer IN8, also at ILL,
was used to measure the main Bragg peak as a function
of temperature. This instrument was configured with
a PG monochromator delivering an incident wavelength
of 1.53 Å (= 35 meV). The Q-resolution was defined
with 40' collimation before and after the sample, and was
measured by making a reciprocal space map of the (111)
Bragg peak from a silicon single crystal. No energy
analysis was used, and the static approximation was assumed
to hold for the measurements. Temperature control was
achieved using a liquid helium cryofurnace.

FIG. 1. Reciprocal space map of B-form Li-DNA. The axes
are the momentum transfer parallel ($Q_\parallel$) and perpendicular
($Q_\perp$) to the fibre axis. The strong Bragg peak is observed at
$Q_\parallel = 2\pi/a$, where $a \approx 3.36$ Å is the average distance between
the base pairs along the fibre axis. A powder diffraction peak,
coming from the lead wire used to seal the cassette, is just
visible at $Q \approx 2.2$ Å$^{-1}$. Also marked on the figure are the two
standard scans that were repeated at all temperatures. The
figure is identical to Fig. 2 [16].
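
The correspondence between the two incident wavelengths quoted for the spectrometers (2.36 Å and 1.53 Å) and the energies given in parentheses follows from $E = h^2/(2 m_n \lambda^2)$. The short sketch below checks that conversion; the constants are the CODATA values, and the function name is ours.

```python
H_PLANCK = 6.62607015e-34      # Planck constant, J s
M_NEUTRON = 1.67492749804e-27  # neutron mass, kg
E_CHARGE = 1.602176634e-19     # joule per electronvolt


def neutron_energy_mev(wavelength_angstrom):
    """Kinetic energy in meV of a neutron with the given wavelength in angstrom."""
    lam = wavelength_angstrom * 1.0e-10           # wavelength in metres
    energy_joule = H_PLANCK ** 2 / (2.0 * M_NEUTRON * lam ** 2)
    return energy_joule / E_CHARGE * 1.0e3        # J -> eV -> meV


for lam in (2.36, 1.53):
    print(f"lambda = {lam} angstrom -> E = {neutron_energy_mev(lam):.2f} meV")
```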

Two scans were repeated at all temperatures. Their
trajectories were calculated for nominally elastic scattering
and are shown in figure 1. Scan 1 was along the fibre
axis, through the centre of the Bragg peak. The $Q_\perp$ for
scan 2 was chosen such that, when $Q_\parallel = 2\pi/a$, the
direction of the scattered beam would be perpendicular to
the fibre axis. This type of scan has been used to measure
critical phase transitions in low dimensional magnets
and assists the static approximation [13]. The temperature
steps close to the melting transition were very small
(0.1 K) and measurements at a given temperature were
repeated numerous times to ensure thermal equilibrium
and reproducibility. Examples of the scans at different
temperatures are shown in figure 2.

The data were fitted with the Lorentzian function

$$S(Q_\parallel) = \frac{I_0}{2\pi}\, \frac{\Gamma}{(\Gamma/2)^2 + (Q_\parallel - Q_0)^2} ,$$

(1)

where $Q_0$ is the peak centre, $I_0$ is the integrated intensity
and $\Gamma$ is the width. The function was convoluted
with the instrument resolution. A second Lorentzian centred
at $Q_\parallel \approx 1.5$ Å$^{-1}$ was needed to fit the scan 2 data.
The amplitudes for these two peaks were free parameters,
however their widths were set to be equal in the
fits. Examples of the fits are also shown in figure 2.
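
As an illustration of the fitting function, the sketch below implements a Lorentzian parametrised by its integrated intensity, centre and width, and checks numerically that the area equals the integrated intensity. It assumes that the width is the full width at half maximum and it does not include the convolution with the instrument resolution; the parameter values are arbitrary and the helper names are ours.

```python
import math


def lorentzian(q, i0, q0, gamma):
    """Lorentzian of area i0, centre q0 and full width at half maximum gamma."""
    half = gamma / 2.0
    return (i0 / math.pi) * half / (half ** 2 + (q - q0) ** 2)


def trapezoid(func, lower, upper, n_steps):
    """Composite trapezoidal rule with n_steps equal intervals."""
    step = (upper - lower) / n_steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, n_steps):
        total += func(lower + k * step)
    return total * step


I0, Q0, GAMMA = 3.0, 1.87, 0.05   # arbitrary illustrative values
area = trapezoid(lambda q: lorentzian(q, I0, Q0, GAMMA), Q0 - 50.0, Q0 + 50.0, 40000)
print(f"numerical area / I0 = {area / I0:.5f}")
```

The small deficit of the numerical area comes from the slowly decaying tails that are cut by the finite integration range.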




FIG. 2. Examples of the scans shown in figure 1. The data at
299 K represent the starting point for the experiment. The
sample is in the melting transition at 348.8 K. The fibre
structure has collapsed at 349.1 K. The data have been fitted with
equation (1) and the fits are also shown. The figure is identical
to Fig. 2 [16].

Calorimetry 

Differential Scanning Calorimetry studies were
performed with a DNA film identical to the one used for
neutron scattering, but prepared from a different DNA
solution. Two different samples were used, with masses
of 49 mg and 45.5 mg. The samples cut from the film were
rolled and hermetically sealed into the Hastelloy sample
tube of a Setaram Micro DSC III calorimeter. The reference
tube of this differential calorimeter was empty. After
cooling to 278 K the temperature $T$ was raised to
368 K or 383 K at a rate of 0.6 K/min, maintained for
10 min at the maximum temperature and decreased to
278 K at the same rate of 0.6 K/min. The differential
heat flux $\Delta\Phi$ was measured as a function of time
(temperature) and the specific heat was obtained
from $\left(\Delta\Phi + \tau\, d\Delta\Phi/dt\right)/(dT/dt)$, where $\tau$ is the thermal time
constant of the calorimeter (here $\tau = 60$ s).
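
The role of the thermal time constant can be illustrated with a synthetic signal. The sketch below assumes the usual first-order correction, in which the heat flow is the measured flux plus the time constant times its time derivative, divided by the heating rate to give a heat capacity (a further division by the sample mass, not shown, gives the specific heat). The input is a hypothetical flux that relaxes exponentially towards a constant value, for which the corrected signal should be constant; the function name and the numerical values other than the time constant and the heating rate are ours.

```python
import math

TAU = 60.0          # thermal time constant of the calorimeter, s (from the text)
RATE = 0.6 / 60.0   # heating rate in K/s (0.6 K/min, from the text)


def corrected_heat_capacity(flux, dt, tau=TAU, rate=RATE):
    """Return (flux + tau * d(flux)/dt) / (dT/dt) at the interior sample points,
    using central finite differences for the time derivative."""
    result = []
    for k in range(1, len(flux) - 1):
        dflux = (flux[k + 1] - flux[k - 1]) / (2.0 * dt)
        result.append((flux[k] + tau * dflux) / rate)
    return result


# Hypothetical test signal: a constant heat flow TRUE_POWER seen through
# the first-order response of the calorimeter.
TRUE_POWER = 2.0e-3   # W
DT = 1.0              # sampling interval, s
flux = [TRUE_POWER * (1.0 - math.exp(-k * DT / TAU)) for k in range(600)]
cp = corrected_heat_capacity(flux, DT)
print(f"first and last corrected values: {cp[0]:.5f}, {cp[-1]:.5f} J/K")
```

Without the correction term the raw flux would only reach its plateau after several time constants, which would smear a sharp peak such as the one expected at the denaturation.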

Optical observations 

A small piece of DNA film has been sealed between 
two glass plates to preserve its water content while it 
was heated on a hot plate below an optical microscope 
at the rate of 1 K/min. The sample was lighted and ob- 
served from the top as it went through the denaturation 
temperature of DNA. 

Gel electrophoresis 

A small piece of the sample (0.01 g taken before and 
after heating in the neutron scattering experiment) was 
dissolved in water. The solution was used to run a
standard gel electrophoresis experiment, using a 1%
non-denaturing agarose gel stained with ethidium
bromide. Comparisons with DNA mass ladders were used to
measure the length of the DNA fragments in the sample.

A similar experiment was performed with the solution 
used to prepare the DNA fibres to probe the state of the 
DNA molecules prior to any treatment. 



B. Experimental results 

Figures 2 and 3 illustrate the main result given by neutron
diffraction. The integrated intensity of the Bragg
peak stays constant from room temperature to about
T = 346 K. At this temperature it starts to show a small
decrease occurring over a temperature range of about 3 K,
followed by an abrupt drop. A more careful examination
of Fig. 3 reveals the following results:

• The intensity of the peak observed in scan 1 is 
larger than that of the off-centre peak in scan 2, 
as expected, but its evolution versus temperature 
is remarkably similar in both scans. 

• The width of the off-centre peak (scan 2) is signif- 
icantly larger than the width at the centre of the 
diffraction spot. 

• In the 300-340 K temperature range the width of
both peaks is essentially constant and even shows
a slight decrease which can be attributed to an
annealing of the sample, as shown by Fig. 4. In the
experiment shown in this figure, another sample was
heated up to 340 K, then cooled down to 330 K and
heated up to 340 K again. During the first heating
the width of the peak decreases, but then, in
the cooling stage, it keeps its lowest value, as well
as during reheating. In this experiment the integrated
intensity of the peak stays constant for all
temperatures.

• For scan 1, Fig. 3(c) shows that, in the vicinity of
the transition, where the annealing has been completed,
the width of the Bragg peak is remarkably
constant until the temperature where the intensity
drops abruptly. This width does not show
any precursor effect, even when the intensity of the
peak starts to decrease in the temperature range
346 < T < 349 K. On the contrary, for scan 2
(off-centre) the width of the peak shows a gradual
increase in this temperature range, which appears to
mirror the decrease of the intensity of the peak.

The drop of the Bragg-peak intensity, shown in Fig. 3,
is extremely sharp in terms of temperature, but requires
some time to complete. This is shown in Fig. 5. When
the transition was reached and the intensity started to
drop, we stopped heating and observed the time evolution
of the peak at constant temperature T = 349.05 ± 0.03 K.
FIG. 3. Integrated intensity (a) and width $2a\Gamma$
(dimensionless) (b) of the Bragg peaks versus temperature. The small
discontinuity at T ≈ 345 K is due to a small misalignment
of the instrument that was discovered and subsequently
corrected. Results for scan 1 use empty symbols while results
for scan 2 are plotted with filled symbols. The bottom panel
(c) shows the temperature evolution of the intensity (circles)
and width of the peak in the immediate vicinity of the
melting transition. The evolution of the intensity is the same
for both scans, hence data are only shown for scan 1. The
widths are shown for scan 1 (open squares) and scan 2 (closed
squares).
FIG. 4. Experiment showing the annealing of the sample due
to heating to moderate temperature. The data correspond to 
a scan of type 1. The Bragg peak gets sharper on heating 
(open circles) and keeps its lower width if it is subsequently 
cooled and heated again. 
FIG. 5. Time dependence of the integrated intensity (a)
and width (b) of the Bragg peak at fixed temperature T =
349.05 ± 0.03 K where the sharp drop of the peak intensity
occurs.



Figure 5 shows that the intensity needed about 3 hours to
stabilise to a low value, and the width was not stabilised 
before about 6 hours. 

In the fibre sample the transition that corresponds to
the almost complete vanishing of the Bragg peak associated
with the stacking of the base pairs is not reversible. On
cooling we did not observe a reappearance of the peak,
although a very small recovery can be observed on a
magnified picture of the Bragg peak recorded at room
temperature after cooling, as shown in Fig. 6.

FIG. 6. Plot of the Bragg peak remaining after heating the
sample to 355 K (closed symbols) and after subsequent
cooling to room temperature (open symbols). The sharp peak
around 2.2 Å$^{-1}$ is due to the aluminium of the sample holder.

Figure 7 shows the specific heat of a DNA film obtained
by Differential Scanning Calorimetry (DSC). The sharp
peak at 360 K can be attributed to the thermal
denaturation of DNA, although the transition temperature Tc
cannot be quantitatively compared with the neutron
observations because the measurement was done on a different
sample. Tc strongly depends on external conditions, and
particularly on the ionicity of the solvent, so that the
shape of the melting curve is more significant than the
value of the denaturation temperature when comparing
samples. The transition is not reversible, and the sharp
peak does not reappear if the sample is cooled and the
measurement repeated [20].



The fibre structure of the film is clearly visible in op- 
tical microscopy observations of heated DNA films until 
the denaturation temperature is reached. Then this or- 
ganised structure of parallel fibres is essentially lost and 
the film tends to shrink (Fig. 8).

Gel electrophoresis shows that the length of the DNA
molecules in the solution used to prepare the film, or in
a piece of film which has not been heated, is of the
order of 20 kilobases or larger. However the same
measurements performed on a piece of film which has been heated
up to DNA denaturation and film collapse, and subsequently
cooled to room temperature, only detect DNA
fragments of a few hundred bases, indicating that the
original molecules have been chopped in the thermal
cycle (Fig. 9).
FIG. 7. Specific heat, Cp, of a B-form Li-DNA film similar
to the film used in neutron scattering experiments (full line,
left scale) obtained by DSC. The dashed line shows the
theoretical denaturation profile of the sequence used for the analysis
(Section III): derivative with respect to temperature of the
fraction of open base pairs. The figure is identical to Fig. 1 [16].




FIG. 8. Optical microscopy observation of a piece of film prior 
(a) and after (b) heating above the denaturing temperature 
of DNA. 



III. ANALYSIS 

To analyse the neutron diffraction results, we need 
to proceed in two steps. First, we must determine the 
diffraction pattern of a finite segment of double stranded 
DNA, taking into account the local inhomogeneities in 
its structure, which are associated with the base pair sequence,
and the thermal fluctuations. Second we must study the 
statistical physics of DNA to determine the size distri- 
bution of the closed segments of DNA as a function of 
temperature. 



A. Structure factor of a closed DNA segment. 

FIG. 9. Electrophoresis image showing the length of the DNA
molecules before and after heating of the film. From left to
right: DNA mass ladder SM0321 (100-3000 base pairs) (lane
1), solution used to prepare the DNA film (lane 3), solution
prepared from a piece of film that has not been heated (lane
4), solution prepared from a piece of film after heating (lane
6), DNA mass ladder SM0311 (250-10000 base pairs) (lane 8).
Lanes 2, 5, 7 were not used.

We consider a structurally disordered linear chain of
$M$ sites. Let $a$, the average base-pair spacing, be the
average distance between successive sites and $\lambda_j$ the local
structural deviation from that value between the $j$th and
$(j+1)$st sites; the equilibrium positions of the $m$th and $n$th
sites differ by $(m-n)a + \sum_{j=n+1}^{m} \lambda_j$. Structural disorder
in the transverse direction is similarly expressed by $\eta_j$,
the local deviation (from zero) of the distance between the
$j$th and $(j+1)$st sites in the transverse direction. The
structure factor of such a finite chain segment is given by

$$S_M(\mathbf{Q}) = \sum_{m,n=1}^{M} e^{iQ_\parallel a(m-n)} \left\langle e^{i\sum_{j=n+1}^{m}(Q_\parallel \lambda_j + Q_\perp \eta_j)} \right\rangle \left\langle e^{i[Q_\parallel (x_m - x_n) + Q_\perp (y_m - y_n)]} \right\rangle ,$$

(2)



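where $x_m$, $y_m$ represent, respectively, the longitudinal
and transverse displacements of the $m$th site from its
position at thermal equilibrium, and $\mathbf{Q}$ the scattering
vector, having the component $Q_\parallel$ along the helix axis
and $Q_\perp$ orthogonal to it. Equation (2) is a slight
generalisation of the finite para-crystal theory [21] to account
for disorder and motion in the transverse direction. The
angular brackets denote averages which can be decoupled
because the first refers to structural disorder and the
second to thermal motion. To a first approximation,
structural disorder is modelled by Gaussian variables $\{\lambda_j\}$ and
$\{\eta_j\}$ with zero average and

$$\langle \lambda_i \lambda_j \rangle = \langle \lambda^2 \rangle\, \delta_{ij} , \qquad \langle \eta_i \eta_j \rangle = \langle \eta^2 \rangle\, \delta_{ij} , \qquad \langle \lambda_i \eta_j \rangle = \chi \left( \langle \lambda^2 \rangle \langle \eta^2 \rangle \right)^{1/2} \delta_{ij} ,$$

(3)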



where $\delta_{ij}$ is the Kronecker symbol ($\delta_{ij} = 1$ if $i = j$ and
0 otherwise). We use estimates of the variances $\langle \lambda^2 \rangle$
and $\langle \eta^2 \rangle$ obtained from conformational analysis and
present alternative calculations for uncorrelated ($\chi = 0$)
and correlated ($\chi \neq 0$) structural disorder. Thermal
fluctuations in the longitudinal displacements can be
calculated in the harmonic approximation; thermal
fluctuations in the transverse direction can also be calculated
(cf. next subsection for a particular model). We will
describe them in an approximate fashion which takes into
account the thermal effects due to sequence heterogeneity,
i.e. local variations in $\zeta_j = \langle y_{j+1} \rangle - \langle y_j \rangle$; then

$$\left\langle e^{iQ_\perp (y_m - y_n)} \right\rangle = e^{-Q_\perp^2 \langle \zeta^2 \rangle |m-n|/2} ,$$

(4)

where $\langle \zeta^2 \rangle$ is the variance of the $\zeta_j$'s.

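Putting the various terms together and performing one
of the two summations [21] results in

$$S_M(\mathbf{Q}) = M + 2 \sum_{n=1}^{M-1} (M-n) \cos(n Q_\parallel a)\, e^{-n\Delta} ,$$

(5)

where

$$2\Delta = Q_\parallel^2 \left( \langle \lambda^2 \rangle + a^2 \sigma^2 \right) + Q_\perp^2 \left( \langle \eta^2 \rangle + \langle \zeta^2 \rangle \right) + 2 Q_\parallel Q_\perp \chi \left( \langle \eta^2 \rangle \langle \lambda^2 \rangle \right)^{1/2}$$

(6)

and $\sigma^2 = k_B T/(\mu c_0^2)$, the Debye-Waller correction due
to longitudinal thermal motion at any temperature $T$,
can be obtained from the total DNA mass per base pair
$\mu = 618$ a.m.u. and the measured [23] sound velocity
$c_0 = 2830$ m s$^{-1}$; $k_B$ is the Boltzmann constant.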

Near the first Bragg peak, which is where the present
experiment focused, and for sufficiently large cluster
sizes, the sum (5) can be approximated by

$$S_M(\mathbf{Q}) \approx M S_\infty(\mathbf{Q}) = M\, \frac{\sinh \Delta}{\cosh \Delta - \cos(Q_\parallel a)} .$$

(7)
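
The quality of the large-size approximation can be checked numerically. The sketch below evaluates the finite sum $M + 2\sum_{n=1}^{M-1}(M-n)\cos(n\theta)\,e^{-n\Delta}$, with $\theta = Q_\parallel a$, and compares it with the limiting form $M \sinh\Delta/(\cosh\Delta - \cos\theta)$. The values of $M$, $\Delta$ and $\theta$ are arbitrary illustrative choices, not the ones of the analysis, and the function names are ours.

```python
import math


def structure_factor_exact(m_sites, theta, delta):
    """Finite-chain sum M + 2*sum_{n=1}^{M-1} (M-n) cos(n*theta) exp(-n*delta),
    where theta stands for Q_par * a."""
    total = float(m_sites)
    for n in range(1, m_sites):
        total += 2.0 * (m_sites - n) * math.cos(n * theta) * math.exp(-n * delta)
    return total


def structure_factor_limit(m_sites, theta, delta):
    """Limiting form for large chains: M sinh(delta) / (cosh(delta) - cos(theta))."""
    return m_sites * math.sinh(delta) / (math.cosh(delta) - math.cos(theta))


M_SITES = 5000
DELTA = 0.1
THETA = 2.0 * math.pi + 0.05   # slightly off the first Bragg peak

exact = structure_factor_exact(M_SITES, THETA, DELTA)
limit = structure_factor_limit(M_SITES, THETA, DELTA)
print(f"exact / limit = {exact / limit:.4f}")
```

The relative difference between the two expressions is of order $1/(M\Delta)$, so the limiting form is only appropriate for segments much longer than $1/\Delta$ sites.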



B. Statistical physics of the closed regions of DNA. 

To calculate the size distribution of the closed segments 
of DNA by a statistical physics analysis, we selected the 
PBD model [10], which is sufficiently simple to allow the
analysis of DNA segments of thousands of base pairs, but 
nevertheless includes some data on the structure of the 
molecule which are necessary to calculate the structure 
factor. Moreover, as it describes the molecule in terms 
of a Hamiltonian, its parameters are directly linked to 
physical quantities. Therefore they are easier to deter- 
mine than for Ising models, although they still need to 
be refined by comparison with a variety of experimental 
melting profiles. 

The configuration energy of a DNA molecule of $N$ base
pairs is written as

$$H_y = \sum_{j=1}^{N} \left[ W(y_j, y_{j+1}) + V_j(y_j) \right] ,$$

(8)

where $y_j$ represents the stretching of the $j$th base pair,
due to the transverse displacements of the bases. The
stacking interaction between adjacent bases is described
by the anharmonic potential [10]

$$W(y_j, y_{j+1}) = \frac{1}{2} k \left[ 1 + \rho\, e^{-b(y_j + y_{j+1})} \right] (y_j - y_{j+1})^2 ,$$

(9)



which takes into account the weakening of the interactions
when the pairs are broken. The potential
$V_j(y_j) = D_j[1 - \exp(-a_j y_j)]^2$ is a Morse potential which describes
the combined effects of hydrogen bonding, electrostatic
interactions between the charged phosphate groups, and
solvent effects on the $j$th base pair. The four possible
bases, A, T, G, C, form two types of pairs: A-T, linked
by two hydrogen bonds, and G-C, linked by three
hydrogen bonds. Both the stacking interactions $W$ and
the intra-pair potential $V$ are affected by the sequence
of bases. However, although subtle sequence effects on
short DNA fragments may require the introduction of
different stacking potentials for different interacting pairs
[23], the melting curves of long DNA molecules, with
several thousand base pairs, can be accurately reproduced
by introducing the effect of the sequence in the intra-pair
potential $V_j$ only, which drastically reduces the number
of model parameters. Therefore in our calculations
the stacking interaction is treated as homogeneous.
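
The two contributions to the configuration energy can be written compactly as functions. The sketch below uses the standard squared Morse form for the intra-pair potential and the anharmonic stacking term of the PBD model, with a homogeneous stacking interaction and a sequence dependence carried only by the Morse depth. All numerical values are hypothetical and chosen for illustration only; they are not the parameters of the analysis, and the function names are ours.

```python
import math

# Hypothetical parameter values, for illustration only.
D_MORSE = 0.10   # depth of the Morse potential, eV
A_MORSE = 4.0    # inverse range of the Morse potential, 1/angstrom
K_STACK = 0.03   # stacking coupling constant, eV/angstrom^2
RHO = 2.0        # relative strength of the anharmonic stacking term
B_STACK = 0.8    # decay rate of the anharmonic stacking term, 1/angstrom


def morse(y, depth=D_MORSE, inv_range=A_MORSE):
    """Intra-pair Morse potential: zero at y = 0, plateau at 'depth' for large y."""
    return depth * (1.0 - math.exp(-inv_range * y)) ** 2


def stacking(y1, y2, k=K_STACK, rho=RHO, b=B_STACK):
    """Stacking potential between adjacent pairs; the coupling decreases from
    k*(1 + rho) to k when the pairs open (large y1 + y2)."""
    return 0.5 * k * (1.0 + rho * math.exp(-b * (y1 + y2))) * (y1 - y2) ** 2


def configuration_energy(y, depths):
    """Sum of the on-site Morse terms and of the nearest-neighbour stacking terms.
    depths[j] is the Morse depth of pair j (sequence dependence)."""
    energy = sum(morse(yj, depth=dj) for yj, dj in zip(y, depths))
    energy += sum(stacking(y[j], y[j + 1]) for j in range(len(y) - 1))
    return energy


print(configuration_energy([0.0, 0.1, 2.0], [D_MORSE, 0.12, D_MORSE]))
```

The weakening of the stacking when the pairs are broken is what makes the openings cooperative in this class of models.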

To determine the melting curve or calculate the size of
the closed regions, we need to give a quantitative
definition of a closed base pair. This can be done by choosing a
reference stretching $y_c$. Base pair $j$ is considered as closed
if $y_j < y_c$. We select $y_c = 1.5$ Å, which corresponds to a
base pair whose stretching is well on the plateau of the
Morse potential. The results depend only weakly on the
value of $y_c$ provided it is larger than 1 Å because
molecular dynamics simulations show that, once a base pair has
been stretched to a value that brings it onto the plateau of
the Morse potential, it is likely to open widely.

The statistical weight of a given configuration of the
molecule thermalised at temperature $T$ is

$$\mathcal{Z} = \prod_{j=1}^{N} \int_{y_{l_j}}^{y_{m_j}} dy_j \; e^{-H_y(y_1, y_2, \ldots, y_N)/(k_B T)} .$$

(10)



The lower and upper bounds of the integrals,
$y_{l_j}$ and $y_{m_j}$, depend on the particular configuration. Setting
$y_{l_j} = -\infty$, $y_{m_j} = +\infty$ for all $j$ allows the molecule to
explore its full configurational space and $\mathcal{Z}$ is then the
partition function $Z$. Setting $y_{l_j} = -\infty$, $y_{m_j} = y_c$ defines a
configuration in which base pair $j$ is closed while $y_{l_j} = y_c$,
$y_{m_j} = +\infty$ defines a configuration where it is open. The
integrals associated with all those configurations can be easily
calculated because the model is one-dimensional and
restricted to nearest-neighbour coupling so that, instead
of a highly multidimensional integral, one has to compute
a chain of one-dimensional integrals [11], involving
a kernel which depends on the stretching at two adjacent
sites. Moreover the speed of the calculation can be
significantly increased by expanding the site-dependent kernel
on the basis of the eigenfunctions of a reference kernel,
for instance the kernel associated with an A-T base pair.
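
The reduction of the multidimensional integral to a chain of one-dimensional integrals can be illustrated on a coarse grid. The sketch below is a toy version under strong assumptions: a homogeneous chain, hypothetical parameter values, integrals replaced by sums on a grid of 14 points, and no eigenfunction expansion. It computes the statistical weight with an optional restriction $y_j < y_c$ on selected sites by propagating a vector from site to site, and compares the result with a brute-force sum over all configurations of a three-site chain. All names are ours.

```python
import itertools
import math

KT = 0.03  # thermal energy in eV (hypothetical value)


def morse(y, depth=0.10, inv_range=4.0):
    """Intra-pair Morse potential (hypothetical parameters)."""
    return depth * (1.0 - math.exp(-inv_range * y)) ** 2


def stacking(y1, y2, k=0.03, rho=2.0, b=0.8):
    """Anharmonic nearest-neighbour stacking potential (hypothetical parameters)."""
    return 0.5 * k * (1.0 + rho * math.exp(-b * (y1 + y2))) * (y1 - y2) ** 2


def chain_weight(grid, dy, n_sites, closed_sites, y_c):
    """Statistical weight of a chain in which the sites listed in closed_sites
    are restricted to y < y_c, computed as a chain of one-dimensional sums."""
    site = [math.exp(-morse(y) / KT) for y in grid]
    kernel = [[math.exp(-stacking(y1, y2) / KT) for y2 in grid] for y1 in grid]

    def allowed(j, y):
        return j not in closed_sites or y < y_c

    vec = [site[k] * dy if allowed(0, y) else 0.0 for k, y in enumerate(grid)]
    for j in range(1, n_sites):
        vec = [
            sum(vec[k1] * kernel[k1][k2] for k1 in range(len(grid))) * site[k2] * dy
            if allowed(j, y2) else 0.0
            for k2, y2 in enumerate(grid)
        ]
    return sum(vec)


def brute_force_weight(grid, dy, n_sites, closed_sites, y_c):
    """Same quantity obtained by summing over every configuration (small chains only)."""
    total = 0.0
    for ys in itertools.product(grid, repeat=n_sites):
        if any(ys[j] >= y_c for j in closed_sites):
            continue
        energy = sum(morse(y) for y in ys)
        energy += sum(stacking(ys[j], ys[j + 1]) for j in range(n_sites - 1))
        total += math.exp(-energy / KT) * dy ** n_sites
    return total


GRID = [-0.25 + 0.25 * k for k in range(14)]  # stretching values in angstrom
DY = 0.25
Y_C = 1.5

# Probability that the central pair of a 50-site chain is closed.
P_CLOSED = chain_weight(GRID, DY, 50, {25}, Y_C) / chain_weight(GRID, DY, 50, set(), Y_C)
print(f"probability that site 25 is closed: {P_CLOSED:.4f}")
```

The cost of the chained calculation grows linearly with the number of sites, whereas the brute-force sum grows exponentially, which is why the latter is only used here as a check on three sites.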

These calculations allow us to obtain the probability
$\mathcal{P}(m, i)$ that $m$ adjacent sites, starting at site $i$, are closed,

$$\mathcal{P}(m, i) = \frac{1}{Z}\, \mathcal{Z}(y_1, y_2, \ldots, y_i < y_c, y_{i+1} < y_c, \ldots, y_{i+m-1} < y_c, y_{i+m}, \ldots) ,$$

(11)

by computing the statistical weight of configurations
where restrictions on the integration range are imposed
for all sites belonging to that closed region and no
restrictions are imposed elsewhere. Then these quantities
give the probability $P(m, i)$ that a closed cluster of size
$m$, with open ends, exists at site $i$ through

$$P(m, i) = \mathcal{P}(m, i) - \mathcal{P}(m+1, i-1) - \mathcal{P}(m+1, i) + \mathcal{P}(m+2, i-1) ,$$

(12)

from which the probability to have a closed cluster of size
$m$ in a DNA segment of $N_0$ base pairs is simply

$$P(m) = \frac{1}{N_0} \sum_{i=1}^{N_0} P(m, i) .$$

(13)



The average size of a cluster of closed base pairs is then

$$l_c = \frac{\sum_m m P(m)}{1 - h - P_0} ,$$

(14)

where $h$ is the helix fraction and $P_0$ the statistical weight
of two consecutive bases being in the closed state. Similar
calculations for the open regions of the DNA molecule
give the average size of the denatured regions
$l_o = (1 - h)/(1 - h - P_0)$. These quantities are computed for a
range of cluster sizes from 1 to a maximum value $M$.
To avoid end effects we study a DNA segment of size
$N = N_0 + 2M$ and the sites $i$ are chosen so that the
clusters that we consider are formed of base pairs
which are at least $M$ sites away from the ends.
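
The construction of the cluster statistics can be checked on a toy model in which every base pair is closed independently with probability $h$. This is an assumption made only for the example, since in the actual model neighbouring pairs are correlated. In that case the probability that $m$ given adjacent sites are closed is $h^m$, the inclusion-exclusion combination gives a closed cluster with open ends with probability $h^m(1-h)^2$, the number of clusters per site is $h(1-h)$, and the mean size of a closed cluster is $1/(1-h)$. The function names are ours.

```python
def run_probability(h, m):
    """Probability that m given adjacent sites are all closed, for independent
    sites that are each closed with probability h (toy assumption)."""
    return h ** m


def cluster_probability(h, m):
    """Closed cluster of exactly m sites with open ends, by inclusion-exclusion:
    runs of m sites, minus the two runs of m+1 sites that extend it by one site
    on either side, plus the run of m+2 sites that extends it on both sides."""
    return (run_probability(h, m)
            - 2.0 * run_probability(h, m + 1)
            + run_probability(h, m + 2))


def closed_fraction(h, m_max=2000):
    """Sum over m of m*P(m): fraction of sites that belong to closed clusters."""
    return sum(m * cluster_probability(h, m) for m in range(1, m_max + 1))


def cluster_density(h, m_max=2000):
    """Sum over m of P(m): number of closed clusters per site."""
    return sum(cluster_probability(h, m) for m in range(1, m_max + 1))


def mean_cluster_size(h, m_max=2000):
    """Average size of a closed cluster."""
    return closed_fraction(h, m_max) / cluster_density(h, m_max)


H = 0.9
print(f"mean closed-cluster size for h = {H}: {mean_cluster_size(H):.4f}")
```

For independent sites the two translationally equivalent runs of $m+1$ sites have the same probability, which is why they appear with a factor 2 in the sketch; for a real sequence they must be evaluated separately at sites $i-1$ and $i$.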

For a natural DNA sample, which may contain mil- 
lions of base pairs, the intensity observed in a neutron 
scattering experiment is proportional to 



$$S(\mathbf{Q}) = \sum_{m=1}^{\infty} P(m)\, S_m(\mathbf{Q}) ,$$

(15)

where $S_m(\mathbf{Q})$ is given by Eq. (5), and the size of the DNA
molecule has been extrapolated to infinity.

In practice the calculation is performed with segments
of natural DNA which include 280000 base pairs, and
$P(m, i)$ is computed up to cluster sizes $M$ of a few hundred
base pairs (typically $M = 150$). It is however
easy to determine $P(m)$ for large $m$ because it scales
exponentially with $m$ [29], as shown in Fig. 10(b). Fitting
the numerical data of $\log P(m)$ for $40 < m < M$ by a
straight line we get $P(m) = P_0 C^m$ for $m > M$, so that in
practice the calculation of $S(\mathbf{Q})$ by Eq. (15) is expressed
as



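$$S(\mathbf{Q}) = \sum_{m=1}^{M} P(m)\, S_m(\mathbf{Q}) + P_0 \sum_{m=M+1}^{M_0} C^m S_m(\mathbf{Q}) + S_\infty(\mathbf{Q})\, P_0\, C^{M_0+1} \left[ \frac{M_0}{1-C} + \frac{1}{(1-C)^2} \right] ,$$

(16)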



where the summation for to > Aio (Mo — 1000) has 
been calculated analytically using the property that, for 
large to, S'm(Q) can be approximated by the limiting 
form ([7]). The same method can be used to compute the 
average cluster size Ic (Eq. ([H])) versus temperature. A 
typical result is shown on Fig. [TOlc. Then the structure 
factor is fitted with the same Lorentzian expression as 
the one used to analyse experimental data to determine 
the integrated intensity /q and width F of the diffraction 
peak. 
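
The extrapolation step can be sketched numerically. The example below is a minimal illustration, not the code used for the analysis: the array `p` is synthetic data standing in for the computed $P(m)$, and the function names are hypothetical. It fits $\log P(m)$ by a straight line to obtain $P_0$ and $C$, then checks the closed form of the tail $\sum_{m > M_0} m\, P_0 C^m = P_0 C^{M_0+1}\,[M_0/(1-C) + 1/(1-C)^2]$ against a brute-force sum. In the structure factor this tail is multiplied by the per-site limiting factor of $S_m(Q)$.

```python
import numpy as np

def fit_exponential_tail(m, p, m_min=40):
    """Fit log P(m) = log P0 + m log C on the range m > m_min; return (P0, C)."""
    mask = m > m_min
    slope, intercept = np.polyfit(m[mask], np.log(p[mask]), 1)
    return np.exp(intercept), np.exp(slope)

def tail_sum(p0, c, m0):
    """Closed form of sum_{m > m0} m * p0 * c**m, valid for 0 < c < 1."""
    return p0 * c ** (m0 + 1) * (m0 / (1.0 - c) + 1.0 / (1.0 - c) ** 2)

# Synthetic stand-in for the computed probabilities, m = 1..150
# (the values 2e-3 and 0.97 are illustrative assumptions).
m = np.arange(1, 151)
p = 2e-3 * 0.97 ** m
p0, c = fit_exponential_tail(m, p)

m0 = 1000
big_m = np.arange(m0 + 1, 20_000)           # long enough for the sum to converge
brute_force = np.sum(big_m * p0 * c ** big_m)
print(p0, c, tail_sum(p0, c, m0), brute_force)
```

The two values of the tail agree to numerical precision, which is the identity behind the bracket in Eq. (16).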



C. Model parameters 

To analyse the neutron scattering experiments, in
principle we would need to know the base-pair sequence 
in the sample. As the experiment requires a significant 
amount of DNA it can only be performed with natural 
DNA. The salmon testes DNA that we use is provided 
without its sequence js^] , and even its GC content is only 
approximately known. It is estimated to be 41.2%. 

The theoretical analysis has been tested on different
DNA sequences from the genome of Danio rerio (zebrafish) [31]
and Pyrococcus abyssi [32]. The results presented in Fig. 10
have been obtained with a sequence
of 280000 bases, part of the full genome of Pyrococcus
abyssi, chosen because its denaturation curve is the closest
to the denaturation profile of our samples measured
by differential scanning calorimetry (Fig. 7). The GC
content of this fragment is 44.08%.

Model parameters have been obtained from an exten- 
sive study of DNA denaturation on various sequences 
which determined a set of parameters allowing the
prediction of melting curves of various DNA sequences 
to a good accuracy. Experiments to record DNA denat- 
uration curves are generally performed in solution with 
sodium salt, the Na+ ions being necessary to stabilise 
DNA by screening the repulsions between the charged
phosphate groups. However our neutron scattering ex- 
periments have been performed with Li-DNA because its 
secondary structure is more stable, and stays in the B- 
form over a much higher humidity range, than that of 
Na-DNA m, 1^. Therefore the parameters of the AT 
and GC Morse potentials cannot be determined unambiguously.
The values that we use correspond to a high
sodium content. They have been chosen on the basis of
the shape of the melting curve rather than the value of
the denaturation temperature $T_c$. This is why the comparison
between theory and experiments must be made
with the reduced temperature $\theta = T/T_c$ rather than
with the actual temperature. The parameters selected
for the analysis are $k = 0.00045$ eV/Å$^2$, $\rho = 50$ and
$b = 0.20$ Å$^{-1}$, $a_{AT} = 4.2$ Å$^{-1}$, $a_{GC} = 6.9$ Å$^{-1}$ and
$D_{AT} = 0.12905$ eV, $D_{GC} = 0.16805$ eV.
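
To show how these parameters enter the energy, here is a minimal sketch. The functional forms are assumptions: a Morse potential $D\,(e^{-a y} - 1)^2$ for the stretching $y$ of a base pair and the usual nonlinear stacking term $\tfrac{k}{2}\,[1 + \rho\, e^{-b (y_n + y_{n-1})}]\,(y_n - y_{n-1})^2$ of Peyrard-Bishop-Dauxois type models, with the constant quoted as 0.20 Å$^{-1}$ taken as the decay parameter of the stacking term. The model definition itself is given earlier in the paper; the function and constant names below are hypothetical.

```python
import numpy as np

# Parameters quoted in the text (energies in eV, lengths in angstrom).
K = 0.00045      # stacking coupling constant, eV / angstrom^2
RHO = 50.0       # relative strength of the nonlinear stacking term
B = 0.20         # decay parameter of the stacking term, 1 / angstrom (assumed role)
MORSE = {"AT": (0.12905, 4.2), "GC": (0.16805, 6.9)}   # (D in eV, a in 1/angstrom)

def morse(y, pair):
    """Assumed on-site Morse potential D (exp(-a y) - 1)^2 for stretching y."""
    depth, inv_width = MORSE[pair]
    return depth * (np.exp(-inv_width * y) - 1.0) ** 2

def stacking(y1, y2):
    """Assumed nonlinear stacking energy between neighbouring base pairs."""
    return 0.5 * K * (1.0 + RHO * np.exp(-B * (y1 + y2))) * (y1 - y2) ** 2

y = np.array([0.0, 0.1, 1.0, 10.0])
print(morse(y, "AT"), morse(y, "GC"), stacking(0.0, 1.0))
```

With these forms the Morse energy vanishes at $y = 0$ and saturates at the depth $D$ for a fully open pair, while the stacking term is much stiffer between two closed pairs than between open ones.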



D. Results 

Figure 10 summarises the main results of the theoretical
analysis by showing the variation versus temperature
of the integrated intensities and widths of the Bragg
peaks in scans 1 and 2, as they are predicted by the model.
As expected the intensity shows a sharp drop near the
denaturation transition. It reflects the openings of the
base pairs, which break the clusters of stacked pairs giving
rise to the investigated Bragg peak and therefore reduce
the number of scattering sites. As a result the integrated
intensity of the peak provides an almost quantitative measure
of the helix fraction of DNA because, as shown by
Eq. (7), for sufficiently large clusters, the structure factor
is proportional to the number of sites in a cluster.
Moreover Fig. 10c shows that the size of the clusters
drops significantly only in the last stage of the denaturation.

The width of the Bragg peak provides the spatial information
that standard observations of DNA denaturation
cannot give. It is strongly sensitive to the distribution
of the sizes of the diffracting clusters. The drop of the
average cluster size in the last stage of the transition
(Fig. 10-c) is reflected in the large increase of the width of
the Bragg peak predicted by the theory in the high temperature
range (Fig. 10-d). For scan 2, with a nonzero
transverse component $Q_\perp$ of the scattering vector, the
width of the peak is also affected by the transverse structural
disorder due to the effect of the sequence (variables
$\eta_j$) and by their correlations with the longitudinal structural
disorder (variables $\lambda_j$), measured by the coefficient
$\chi$ in Eq. (3). The statistical properties of $\lambda_j$ and $\eta_j$ have
been obtained by conformational analysis [22] but their
correlations have not been determined. We show results
with $\chi = 0$ (no correlation) and $\chi = 0.35$, corresponding
to moderate correlations. Moreover, scan 2 also
probes the transverse fluctuations of the bases prior to
opening. The inset in Fig. 10d shows that, in the vicinity
of the transition, these fluctuations are expected to
cause an extra increase of the width of the Bragg peak,
for scan 2 only. However this effect is small because the
Bragg peak is only generated by the closed sections of
the DNA molecules, where the fluctuations are therefore
limited to small amplitude motions of the bases.
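
The link between cluster size and peak width can be illustrated with the simplest possible case. The sketch below is a simplification under stated assumptions: it treats a cluster as $m$ perfectly ordered, equally spaced scatterers, ignoring the structural disorder, the thermal fluctuations and the cluster-size distribution that enter the actual structure factor, and it uses a rise per base pair of 3.4 Å as a typical B-DNA value rather than a number taken from this analysis. The function names are hypothetical.

```python
import numpy as np

A = 3.4  # rise per base pair in angstrom (typical B-DNA value, assumed here)

def cluster_structure_factor(q, m, a=A):
    """|sum_{n=0}^{m-1} exp(i q n a)|^2 for a perfectly ordered cluster of m sites."""
    n = np.arange(m)
    amplitude = np.exp(1j * a * np.outer(q, n)).sum(axis=1)
    return np.abs(amplitude) ** 2

def peak_fwhm(m, a=A, npts=2001):
    """Full width at half maximum of the first Bragg peak, measured on a grid."""
    q0 = 2.0 * np.pi / a                  # peak position
    half_window = 2.0 * np.pi / (m * a)   # distance from the peak to its first zero
    q = np.linspace(q0 - half_window, q0 + half_window, npts)
    s = cluster_structure_factor(q, m, a)
    above = q[s >= 0.5 * s.max()]
    return above.max() - above.min()

for m in (25, 100, 400):
    # The dimensionless product width * m * a is nearly independent of m.
    print(m, peak_fwhm(m) * m * A)
```

In this idealised picture the width multiplied by the cluster length $m a$ is a constant, so the width scales as the inverse of the cluster size: clusters of about 100 base pairs give a peak four times broader than clusters of 400.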



IV. DISCUSSION 

To allow a quantitative comparison of the experimental
and theoretical results we plot the data as a function
of a reduced temperature $\theta = T/T_c$, where $T_c$ is the
temperature where 50% of the bases are open. This is done
in Fig. 11. To eliminate the experimental factor associated
with the apparatus, in this figure the experimental
intensity of the Bragg peak has been rescaled to 1 in the
low $\theta$ limit, which is also the limit of the theoretical
intensity in the low temperature range. The widths have
been multiplied by the mean separation of the base pairs,
$a$, to create dimensionless variables. Therefore a quantitative
comparison between theoretical and experimental
widths is possible and it does not involve any arbitrary
factor.
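
The rescaling described above amounts to a few lines of arithmetic. The following sketch assumes a monotonically increasing melting curve and uses synthetic data (a sigmoid centred at 365 K with an arbitrary width); the function name `melting_temperature` and all numerical values are illustrative assumptions, not results of the analysis.

```python
import numpy as np

def melting_temperature(temps, open_fraction):
    """Temperature at which the open fraction crosses 0.5 (linear interpolation).

    Assumes open_fraction increases monotonically with temps.
    """
    return float(np.interp(0.5, open_fraction, temps))

# Synthetic melting curve standing in for measured or computed data.
temps = np.linspace(300.0, 380.0, 161)                         # kelvin
open_fraction = 1.0 / (1.0 + np.exp(-(temps - 365.0) / 1.5))

t_c = melting_temperature(temps, open_fraction)
theta = temps / t_c                                            # reduced temperature

# Intensities are rescaled to 1 at low theta; widths are multiplied by the
# base-pair separation to make them dimensionless.
intensity = 7.3 * (1.0 - open_fraction)       # arbitrary instrumental scale
intensity_rescaled = intensity / intensity[0]
print(t_c, theta[0], theta[-1], intensity_rescaled[:3])
```

Because both the theoretical and the experimental curves are treated in the same way, no adjustable scale factor is left in the comparison.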

Let us first examine the integrated intensity of the
Bragg peak. Both the experimental data and the theoretical
curve show an intensity which stays almost constant
up to temperatures very close to $T_c$, $\theta \approx 0.97$. As
previously noted this reflects the very low fraction of open
base pairs until the vicinity of the transition is reached,
in agreement with the theoretical result of Fig. 10a. In
the early stage of the transition the theoretical curve
follows the experimental decay of the intensity of the peak,
but in the immediate vicinity of $T_c$ ($\theta \approx 1$) the
experimental intensity shows an almost discontinuous drop,
while the theoretical curve has a narrow but smooth decay.
This discrepancy is a sign that another phenomenon,
not included in the theoretical description, happens. As
discussed in Sec. II the optical observations of a sample
during heating indicate that, at high temperature, the
film itself shows an irreversible "collapse" characterised
by a disorganisation of the oriented fibre structure.
Although this is not described by the model, the theoretical
analysis sheds some light on this phenomenon. Figure
10c shows that, above $T_c$, the length of the denatured
regions grows very quickly with temperature, showing
almost a divergence. As the single strands are very flexible
they can gain a lot of entropy by fully losing their initial
orientation so that the remaining double-helix segments
are embedded in a liquid-like medium of entangled single
strands which quickly becomes the dominant phase
in the sample. The rapid growth of the size of the open
fragments gives a lot of freedom to the rigid double-helix
segments allowing them to lose their spatial orientation,
which causes the sharp drop of the intensity of the Bragg
peak. At higher temperatures the sample is no longer
a good approximation of a one-dimensional crystal, but
is more an ensemble of disoriented, short-length DNA.
This temperature range corresponds to the shaded area
in Figs. 10 and 11. It is interesting to notice that, when
this "collapse" of the film occurs, the theory predicts that
the size of the closed segments is still large, of the order of
80-100 base pairs. This is indirectly confirmed by the
electrophoresis analysis of the length of the DNA fragments
in the film before and after heating, which indicates
that, after film melting, the DNA molecules that were

FIG. 10. Theoretical results. (a) Melting curve of a reference DNA segment of 280000 base pairs, part of the genome of
Pyrococcus abyssi: the circles show the open fraction versus temperature and the stars correspond to its derivative to get the
melting profile. (b) Probabilities $P(m)$ versus $m$ at different temperatures in the range T = 330 K to T = 375 K, in logarithmic
scale. The points are the values calculated from the statistical mechanics of the DNA sample, and the lines show a linear fit
for $40 < m < 150$. Curves are plotted every 10 K from 300 to 340 K and every 1.5 K above. (c) The average size of the closed
clusters $l_c$ (full line) and the average size of the open regions $l_o$ (dotted line) versus T. The dashed line shows the melting
profile on the same temperature scale. (d) Integrated intensity (thin full line) and width (thick full line for scan 1, dotted and
dashed lines for scan 2) versus temperature. For scan 2 the figure shows two cases: $\chi = 0$, ignoring the correlation between
longitudinal $\lambda_j$ and transverse $\eta_j$ components of the structural disorder due to the sequence (dotted line), and $\chi = 0.35$ (dashed
line), which assumes a partial correlation. The shaded area of the plot shows the temperature range in which the experimental
observations are hindered by the melting of the sample film (Sec. IV). The inset shows a magnification of the variation of the
widths versus temperature in the immediate vicinity of the transition.

FIG. 11. Comparison between theory and experiment. The
points are the experimental results while the curves plot the
theoretical results. A reduced temperature $\theta = T/T_c$ is used,
where $T_c$ is the temperature where 50% of the bases are open.
The circles show the integrated intensity of the Bragg peak,
rescaled to 1 at low temperature. After rescaling, the results
for scans 1 and 2 exactly follow the same curve. The thin line
is the calculated integrated intensity. The open squares show
the experimental width of scan 1 and the thick full line is
the theoretical value for this width. The filled squares show
the width of scan 2. The dotted line shows the theoretical
width of scan 2 calculated by assuming that the longitudinal
and transverse structural disorder due to the sequence,
determined by conformational analysis [22], are uncorrelated
($\chi = 0$) while the dashed line is the theoretical width of scan 2
calculated by assuming a partial correlation between the
longitudinal and transverse structural disorder ($\chi = 0.35$). The
shaded area of the plot shows the temperature range in which
the experimental observations are hindered by the collapse of
the sample film. The figure is identical to Fig. 3 p^ . 

more than 20000 base pairs long at low temperature are
chopped into segments of a few hundred base pairs.
This breaking can be understood from the high stress concentration
that occurs at the ends of the rigid fragments,
linked to each other by the flexible single strands, when
they rotate as the film melts. It is therefore not surprising
to detect DNA fragments which have a length of the
order of the size of the closed clusters.

Let us now examine the width of the Bragg peak. For 
scan 1 the calculation of the width does not involve any 
free parameter once the model has been calibrated to 
match the denaturation curve of DNA. The other param- 
eters entering in the structure factor calculation (Eqs. ([S]) 
and ([TB)) ) are derived from the structure of DNA [23 and 
its sound velocity measured along the helix axis [23| . For 
scan 1 ($Q_\perp = 0$), the calculation gives a result which is in
good agreement with experiments (Fig. ITU) , although the 
experimental results are probably affected by some an- 
nealing of the sample causing a slight decay of the width 
of the peak (Fig.S]) that the theory does not describe. In 



spite of this limitation two points emerge from the comparison
between theory and experiments. First, the structural
data which enter in the calculation of the width for
scan 1, and particularly the fluctuations of the base-pair
distances along the helix, measured by $\langle \lambda^2 \rangle$, corresponding
to a standard deviation of 0.18 Å, are here tested on a
large scale since the width of the diffraction peak involves
an average over the billions of base pairs of the sample.
The discrepancy of less than 15% between the calculated
and experimental widths indicates that the results of the
conformational analysis [22] are accurate. Second, for
scan 1, the width of the peak is remarkably constant until
$T_c$ and the collapse of the film. This indicates that
the base pair openings, which start to be very significant
at $\theta = T/T_c = 0.98$, do not cause a sharp decrease of the
size of the diffracting clusters until the denaturation has
occurred. This is what the theoretical model indicates
(Fig. 10-c). Clusters of about 100 base pairs remain intact
well within the denaturation region and denature as
a whole, this being allowed by the surrounding open bubbles
in the last stage of the denaturation. This provides
a good test of the statistical physics description of DNA
that we use, validating the model beyond its ability to
predict melting curves.

Contrary to scan 1, the calculation of the width of
the Bragg peak in scan 2, with a transverse component
of the scattering vector, involves an unknown parameter,
the coefficient $\chi$ that measures the correlation between
the longitudinal and transverse structural disorder
due to the sequence (Eq. (3)). Figure 11 shows that, if we
ignore this correlation by setting $\chi = 0$, the theoretical
value is about 30% lower than the experimental width.
Setting $\chi = 0.35$, i.e. a moderate correlation, we get a
theoretical width which matches the experimental value
for $\theta \approx 0.97$, which we take as a temperature where the
sample is "annealed". The results suggest that neutron
scattering could be used to probe this structural property
of DNA and it would be interesting to test this result by
conformational analysis.

In the range $0.99 < \theta < 1$ the experiment detects a
significant rise of the width of scan 2 which is not shown
by the theoretical curve. According to the structure factor
calculation, the transverse fluctuations of the base pairs,
which become large because they start to open, bring an extra
contribution to the width through the growth of their
mean-square amplitude. This contribution
is visible in the inset of Fig. 10 but, as discussed in
Sec. III, this effect has to be small since only fluctuations
in the closed clusters of base pairs can contribute to the
shape of the Bragg peak. This is not enough to account
for the observed increase of the width of scan 2 below
the transition. However there is another contribution to
the width which is not included in the structure factor
calculation: the misalignment of the molecules. It
is very likely that the collapse of the film is preceded by
increased fluctuations in the orientation of the helix fragments.
This should have a strong influence on the width
of the Bragg peak in scan 2. For instance, orientational
fluctuations of 10 degrees, which change the projection of
the base pair distance on $Q_\parallel$ by less than 2%, lead to a
projection of 17% of this distance along $Q_\perp$. Only a theory
of the collective effects leading to the melting of the
film could properly account for this effect.
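
The orders of magnitude quoted for a 10 degree misalignment follow from elementary trigonometry, as the short check below shows. It assumes only that a helix fragment tilted by an angle $\alpha$ projects its base-pair distance with a factor $\cos\alpha$ along the fibre axis and $\sin\alpha$ across it; the variable names are illustrative.

```python
import math

alpha = math.radians(10.0)   # orientational fluctuation of a helix fragment

# Relative change of the projection of the base-pair distance on Q_parallel.
parallel_change = 1.0 - math.cos(alpha)
# Projection of the base-pair distance along Q_perp, in units of that distance.
transverse_projection = math.sin(alpha)

print(f"{parallel_change:.2%} change along the axis, "
      f"{transverse_projection:.1%} projected across it")
```

The longitudinal projection changes by about 1.5%, while the transverse projection reaches about 17% of the base-pair distance, which is why scan 2 is much more sensitive to misalignment than scan 1.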

V. CONCLUSION 

In conclusion, we have shown that neutron scattering
can be used to monitor the thermal denaturation of
DNA, providing the spatial information that other methods
cannot measure. By focusing the study on the Bragg
peak which is associated with the base pair stacking we can
obtain accurate results which are not limited by the fibre
nature of the samples. The width of the Bragg peak
can be described by a simple nonlinear model for DNA
at the scale of base pairs, thus providing further validation
of this model, which had already proved able to
predict complex DNA denaturation curves with a small
number of parameters. Moreover, when the selected scattering
vector is not parallel to the axis of the DNA
helix, the shape of the Bragg peak is also sensitive to the
transverse fluctuations of the base pairs, which this dynamical
model can calculate. This should allow further
comparison between theory and experiments by investigating
not only the opening of the base pairs, but also
their large scale fluctuations, important in many biological
processes. This aspect could not be investigated in
the present experiments due to the collapse of the fibre
structure of the sample.